Goto

Collaborating Authors

 data-driven method


Strategies to Minimize Out-of-Distribution Effects in Data-Driven MRS Quantification

arXiv.org Machine Learning

This study systematically compared data-driven and model-based strategies for metabolite quantification in magnetic resonance spectroscopy (MRS), focusing on resilience to out-of-distribution (OoD) effects and the balance between accuracy, robustness, and generalizability. A neural network designed for MRS quantification was trained using three distinct strategies: supervised regression, self-supervised learning, and test-time adaptation. These were compared against model-based fitting tools. Experiments combined large-scale simulated data, designed to probe metabolite concentration extrapolation and signal variability, with 1H single-voxel 7T in-vivo human brain spectra. In simulations, supervised learning achieved high accuracy for spectra similar to those in the training distribution, but showed marked degradation when extrapolated beyond the training distribution. Test-time adaptation proved more resilient to OoD effects, while self-supervised learning achieved intermediate performance. In-vivo experiments showed larger variance across the methods (data-driven and model-based) due to domain shift. Across all strategies, overlapping metabolites and baseline variability remained persistent challenges. While strong performance can be achieved by data-driven methods for MRS metabolite quantification, their reliability is contingent on careful consideration of the training distribution and potential OoD effects. When such conditions in the target distribution cannot be anticipated, test-time adaptation strategies ensure consistency between the quantification, the data, and the model, enabling reliable data-driven MRS pipelines.




A Single-Point Measurement Framework for Robust Cyber-Attack Diagnosis in Smart Microgrids Using Dual Fractional-Order Feature Analysis

arXiv.org Artificial Intelligence

Cyber-attacks jeopardize the safe operation of smart microgrids. At the same time, existing diagnostic methods either depend on expensive multi-point instrumentation or stringent modelling assumptions that are untenable under single-sensor constraints. This paper proposes a Fractional-Order Memory-Enhanced Attack-Diagnosis Scheme (FO-MADS) that achieves low-latency fault localisation and cyber-attack detection using only one VPQ (Voltage-Power-Reactive-power) sensor. FO-MADS first constructs a dual fractional-order feature library by jointly applying Caputo and Grünwald-Letnikov derivatives, thereby amplifying micro-perturbations and slow drifts in the VPQ signal. A two-stage hierarchical classifier then pinpoints the affected inverter and isolates the faulty IGBT switch, effectively alleviating class imbalance. Robustness is further strengthened through Progressive Memory-Replay Adversarial Training (PMR-AT), whose attack-aware loss is dynamically re-weighted via Online Hard Example Mining (OHEM) to prioritise the most challenging samples. Experiments on a four-inverter microgrid testbed comprising 1 normal and 24 fault classes under four attack scenarios demonstrate diagnostic accuracies of 96.6 % (bias), 94.0 % (noise), 92.8 % (data replacement), and 95.7 % (replay), while sustaining 96.7 % under attack-free conditions. These results establish FO-MADS as a cost-effective and readily deployable solution that markedly enhances the cyber-physical resilience of smart microgrids.


Tutorial on Using Machine Learning and Deep Learning Models for Mental Illness Detection

arXiv.org Artificial Intelligence

Author Note Correspondence concerning this article should be addressed to Yuchen Cao, Northeastern University, E-mail: cao.yuch@northeastern.edu Abstract Social media has become an important source for understanding mental health, providing researchers a way to detect conditions like depression from user-generated posts. This tutorial provides practical guidance to address common challenges in applying machine learning and deep learning methods for mental health detection on these platforms. It focuses on strategies for working with diverse datasets, improving text preprocessing, and addressing issues such as imbalanced data and model evaluation. Real-world examples and step-by-step instructions demonstrate how to apply these techniques effectively, with an emphasis on transparency, reproducibility, and ethical considerations. By sharing these approaches, this tutorial aims to help researchers build more reliable and widely applicable models for mental health research, contributing to better tools for early detection and intervention. Tutorial on Using Machine Learning and Deep Learning Models for Mental Illness Detection Introduction Mental health disorders, especially depression, have become a significant concern worldwide, affecting millions of individuals across diverse populations (Organization, 2020). Early detection of depression is crucial, as it can lead to timely treatment and better long-term outcomes. In today's digital age, social media platforms such as X(Twitter), Facebook, and Reddit provide a unique opportunity to study mental health. People often share their thoughts and emotions on these platforms, making them a valuable source for understanding mental health patterns (De Choudhury et al., 2013; Guntuku et al., 2017). Recent advances in computational methods, particularly machine learning (ML) and deep learning (DL), have shown promise in analyzing social media data to detect signs of depression. These techniques can uncover patterns in language use, emotions, and behaviors that may indicate mental health challenges (Shatte et al., 2020; Yazdavar et al., 2020).


MP-PINN: A Multi-Phase Physics-Informed Neural Network for Epidemic Forecasting

arXiv.org Artificial Intelligence

Forecasting temporal processes such as virus spreading in epidemics often requires more than just observed time-series data, especially at the beginning of a wave when data is limited. Traditional methods employ mechanistic models like the SIR family, which make strong assumptions about the underlying spreading process, often represented as a small set of compact differential equations. Data-driven methods such as deep neural networks make no such assumptions and can capture the generative process in more detail, but fail in long-term forecasting due to data limitations. We propose a new hybrid method called MP-PINN (Multi-Phase Physics-Informed Neural Network) to overcome the limitations of these two major approaches. MP-PINN instils the spreading mechanism into a neural network, enabling the mechanism to update in phases over time, reflecting the dynamics of the epidemics due to policy interventions. Experiments on COVID-19 waves demonstrate that MP-PINN achieves superior performance over pure data-driven or model-driven approaches for both short-term and long-term forecasting.


Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models

arXiv.org Artificial Intelligence

Predicting phenotypes with complex genetic bases based on a small, interpretable set of variant features remains a challenging task. Conventionally, data-driven approaches are utilized for this task, yet the high dimensional nature of genotype data makes the analysis and prediction difficult. Motivated by the extensive knowledge encoded in pre-trained LLMs and their success in processing complex biomedical concepts, we set to examine the ability of LLMs in feature selection and engineering for tabular genotype data, with a novel knowledge-driven framework. We develop FREEFORM, Free-flow Reasoning and Ensembling for Enhanced Feature Output and Robust Modeling, designed with chain-of-thought and ensembling principles, to select and engineer features with the intrinsic knowledge of LLMs. Evaluated on two distinct genotype-phenotype datasets, genetic ancestry and hereditary hearing loss, we find this framework outperforms several data-driven methods, particularly on low-shot regimes. FREEFORM is available as open-source framework at GitHub: https://github.com/PennShenLab/FREEFORM.


A Soft Robotic System Automatically Learns Precise Agile Motions Without Model Information

arXiv.org Artificial Intelligence

Many application domains, e.g., in medicine and manufacturing, can greatly benefit from pneumatic Soft Robots (SRs). However, the accurate control of SRs has remained a significant challenge to date, mainly due to their nonlinear dynamics and viscoelastic material properties. Conventional control design methods often rely on either complex system modeling or time-intensive manual tuning, both of which require significant amounts of human expertise and thus limit their practicality. In recent works, the data-driven method, Automatic Neural ODE Control (ANODEC) has been successfully used to -- fully automatically and utilizing only input-output data -- design controllers for various nonlinear systems in silico, and without requiring prior model knowledge or extensive manual tuning. In this work, we successfully apply ANODEC to automatically learn to perform agile, non-repetitive reference tracking motion tasks in a real-world SR and within a finite time horizon. To the best of the authors' knowledge, ANODEC achieves, for the first time, performant control of a SR with hysteresis effects from only 30 seconds of input-output data and without any prior model knowledge. We show that for multiple, qualitatively different and even out-of-training-distribution reference signals, a single feedback controller designed by ANODEC outperforms a manually tuned PID baseline consistently. Overall, this contribution not only further strengthens the validity of ANODEC, but it marks an important step towards more practical, easy-to-use SRs that can automatically learn to perform agile motions from minimal experimental interaction time.


Data Poisoning: An Overlooked Threat to Power Grid Resilience

arXiv.org Artificial Intelligence

As the complexities of Dynamic Data Driven Applications Systems increase, preserving their resilience becomes more challenging. For instance, maintaining power grid resilience is becoming increasingly complicated due to the growing number of stochastic variables (such as renewable outputs) and extreme weather events that add uncertainty to the grid. Current optimization methods have struggled to accommodate this rise in complexity. This has fueled the growing interest in data-driven methods used to operate the grid, leading to more vulnerability to cyberattacks. One such disruption that is commonly discussed is the adversarial disruption, where the intruder attempts to add a small perturbation to input data in order to "manipulate" the system operation. During the last few years, work on adversarial training and disruptions on the power system has gained popularity. In this paper, we will first review these applications, specifically on the most common types of adversarial disruptions: evasion and poisoning disruptions. Through this review, we highlight the gap between poisoning and evasion research when applied to the power grid. This is due to the underlying assumption that model training is secure, leading to evasion disruptions being the primary type of studied disruption. Finally, we will examine the impacts of data poisoning interventions and showcase how they can endanger power grid resilience.


Sparse identification of quasipotentials via a combined data-driven method

arXiv.org Machine Learning

The quasipotential function allows for comprehension and prediction of the escape mechanisms from metastable states in nonlinear dynamical systems. This function acts as a natural extension of the potential function for non-gradient systems and it unveils important properties such as the maximum likelihood transition paths, transition rates and expected exit times of the system. Here, we leverage on machine learning via the combination of two data-driven techniques, namely a neural network and a sparse regression algorithm, to obtain symbolic expressions of quasipotential functions. The key idea is first to determine an orthogonal decomposition of the vector field that governs the underlying dynamics using neural networks, then to interpret symbolically the downhill and circulatory components of the decomposition. These functions are regressed simultaneously with the addition of mathematical constraints. We show that our approach discovers a parsimonious quasipotential equation for an archetypal model with a known exact quasipotential and for the dynamics of a nanomechanical resonator. The analytical forms deliver direct access to the stability of the metastable states and predict rare events with significant computational advantages. Our data-driven approach is of interest for a wide range of applications in which to assess the fluctuating dynamics.